SIP-72: dedented triple-quoted string literals #24185

lihaoyi · 2025-10-15T08:29:47Z

Implements scala/improvement-proposals#112

The initial lexing/scanning is lenient: it only looks for the opening ''''* delimiter and the equivalent closing delimiter. This matches how we can expect the feature to be implemented in other tools that have more restricted lexing frameworks (IntelliJ w/ JFlex, VSCode w/ TextMate grammars, NeoVim w/ TreeSitter).
All other validation (the opening delimiter must be followed by a newline, the closing delimiter must be preceded only by whitespace) and dedenting are left to the parsing phase. When an interpolator is present, that is the only point at which we have a complete string "literal" and can therefore look at the indentation preceding the closing delimiter and trim it from all earlier STRINGLIT/STRINGPART tokens.
def interpolatedString needed to be refactored to support dedenting: rather than constructing the trees immediately, we first assemble all the string parts, then use the last string part to compute the dedent that we apply to all other parts, and only then construct the trees.
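
As a rough illustration of the "collect the parts first, then dedent using the last part" approach described above, here is a minimal sketch. It is not the code in Parsers.scala: `dedentParts` is a hypothetical name, the parts are modelled as plain strings rather than tokens with offsets, errors are raised with `require` instead of being reported at source positions, and it assumes that the newline after the opening delimiter and the newline before the closing delimiter are dropped from the result.

```scala
// Hypothetical, simplified model of the dedent step (not the compiler's actual code).
// `parts` are the raw string segments between interpolated expressions, in source
// order, still containing the newline that follows the opening delimiter.
def dedentParts(parts: List[String]): List[String] =
  require(parts.nonEmpty && parts.head.startsWith("\n"),
    "opening delimiter must be followed by a newline")
  val last = parts.last
  require(last.contains('\n'), "closing delimiter must be on its own line")
  // Everything after the final newline is the indentation before the closing delimiter.
  val indent = last.substring(last.lastIndexOf('\n') + 1)
  require(indent.forall(c => c == ' ' || c == '\t'),
    "closing delimiter must be preceded by whitespace only")
  // Strip the closing indentation after every newline in every part.
  val stripped = parts.map(_.replace("\n" + indent, "\n"))
  // Drop the newline after the opening delimiter ...
  val withoutLeading = stripped.updated(0, stripped.head.drop(1))
  // ... and the newline before the closing delimiter.
  withoutLeading.updated(withoutLeading.length - 1, withoutLeading.last.dropRight(1))
```

The point of the sketch is the ordering: the indentation is only known once the last part has been seen, so no part can be turned into a tree before all of them have been collected.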
Covered by neg/ tests and run/ tests for all the major features and edge cases I could think of:
- All indentation removed
- Some indentation preserved
- Empty strings
- Single-line strings
- Blank lines in the string
- Leading and trailing blank lines
- Varying indentation
- Extensible delimiters with 4 and 5 quotes
- Funky operator and unicode characters in the string
- Tab-based indentation
- Interpolation with s and f
- Single- and Multi-line pattern matching with and without interpolation
- In larger expressions: lists, infix operators, etc.
- As singleton-type ascriptions and singleton-type parameters
- As literals passed to @compileTimeOnly
I haven't managed to run the tests reliably; I think I'm bumping into https://contributors.scala-lang.org/t/current-testcompilation/7256. Instead, I tested manually by copy-pasting the run/neg test files into the bin/scala REPL and comparing the output with the .check files on disk; the output is identical.

odersky · 2025-10-15T08:34:21Z

That was fast!

lihaoyi · 2025-10-15T08:46:51Z

Not ready to review yet! Still need a bit more vibing haha

Gedochao · 2025-10-15T08:49:01Z

> Not ready to review yet! Still need a bit more vibing haha

Ah right, I'll convert it to draft then

He-Pin · 2025-10-15T16:47:04Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

+
+        val hasTabs = closingIndent.contains('\t')
+        val hasSpaces = closingIndent.contains(' ')
+        if (hasTabs && hasSpaces) {


Should be able to detect this in one loop
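
One possible shape for the single-loop check suggested here, as a sketch only. `scanIndent` is a hypothetical helper name, and the early exit once both kinds of whitespace have been seen is an added assumption, not something requested in the review.

```scala
// Hypothetical helper: returns (hasTabs, hasSpaces) after one traversal of the indent.
def scanIndent(closingIndent: String): (Boolean, Boolean) =
  var hasTabs = false
  var hasSpaces = false
  var i = 0
  // Stop early once both kinds of whitespace have been seen.
  while i < closingIndent.length && !(hasTabs && hasSpaces) do
    closingIndent.charAt(i) match
      case '\t' => hasTabs = true
      case ' '  => hasSpaces = true
      case _    => ()
    i += 1
  (hasTabs, hasSpaces)
```

For indentation strings of realistic length the two `contains` calls and the single loop do the same amount of useful work, so this is mostly a question of style.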

odersky · 2025-10-15T17:02:55Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

      else
        literal(inTypeOrSingleton = true)

+    /** Dedent a string literal by removing common leading whitespace.


For new code in the compiler we use indentation syntax and new conditional if / then / else syntax. The old Java conditional syntax is already disabled under -language.future.
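
For reference, a sketch of what the mixed tabs/spaces check from the earlier hunk might look like in the indentation-based style with `if` / `then` / `else`. The function name and the message text are made up for illustration; the real code reports an error at a source position instead of returning a value.

```scala
// Brace-based, Java-style conditional as in the diff:
//   if (hasTabs && hasSpaces) { ... }
// Indentation syntax with the new conditional form:
def mixedIndentMessage(closingIndent: String): Option[String] =
  val hasTabs = closingIndent.contains('\t')
  val hasSpaces = closingIndent.contains(' ')
  if hasTabs && hasSpaces then
    Some("closing indentation mixes tabs and spaces") // hypothetical message text
  else
    None
```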

lihaoyi · 2025-10-15T17:09:26Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

+      val isDedented =
+        in.charOffset + 2 < in.buf.length &&
+        in.buf(in.charOffset - 1) == '\'' &&
+        in.buf(in.charOffset) == '\'' &&
+        in.buf(in.charOffset + 1) == '\''
      in.nextToken()
-      def nextSegment(literalOffset: Offset) =
-        segmentBuf += Thicket(
-            literal(literalOffset, inPattern = inPattern, inStringInterpolation = true),
-            atSpan(in.offset) {
-              if (in.token == IDENTIFIER)
-                termIdent()
-              else if (in.token == USCORE && inPattern) {
-                in.nextToken()
-                Ident(nme.WILDCARD)
-              }
-              else if (in.token == THIS) {
-                in.nextToken()
-                This(EmptyTypeIdent)
-              }
-              else if (in.token == LBRACE)
-                if (inPattern) Block(Nil, inBraces(pattern()))
-                else expr()
-              else {
-                report.error(InterpolatedStringError(), source.atSpan(Span(in.offset)))
-                EmptyTree
-              }
-            })

-      var offsetCorrection = if isTripleQuoted then 3 else 1
-      while (in.token == STRINGPART)
-        nextSegment(in.offset + offsetCorrection)
+      // Collect all string parts and their offsets
+      val stringParts = new ListBuffer[(String, Offset)]
+      val interpolatedExprs = new ListBuffer[Tree]
+
+      var offsetCorrection = if (isDedented) 3 else if (isTripleQuoted) 3 else 1


This bit is super sketchy, I'm sure there's a better way

lihaoyi · 2025-10-25T00:44:09Z

Marking this as ready to review since the SIP has been voted into the experimental phase

lihaoyi · 2025-10-25T05:25:41Z

@odersky @sjrd looks like most of the tests are green, so this should be ready for review. I couldn't find any error message in the failing job's logs, so I might need someone more familiar with the CI setup to take a look

lihaoyi · 2025-10-25T10:56:32Z

The last failure turned out to be due to error message positioning issues, fixed that and it's all green now

noti0na1 · 2025-10-27T13:07:38Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

+
+      val isDedented =
+        in.charOffset + 2 < in.buf.length &&
+        in.buf(in.charOffset - 1) == '\'' &&


Why do we look at offset - 1 here? And we should be able to reuse the logic of isDedentedStringLiteral?
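
One way to make the index arithmetic easier to audit and to share between call sites is a small helper that takes an explicit start index and checks both bounds. This is a sketch: `tripleQuoteAt` is a hypothetical name, and whether it matches what isDedentedStringLiteral does is an assumption, since that method is not shown in this thread.

```scala
// Hypothetical helper: true when buf(start), buf(start + 1) and buf(start + 2)
// are all single quotes. Both bounds are checked, so any start index is safe.
def tripleQuoteAt(buf: Array[Char], start: Int): Boolean =
  start >= 0 &&
  start + 2 < buf.length &&
  buf(start) == '\'' &&
  buf(start + 1) == '\'' &&
  buf(start + 2) == '\''
```

With such a helper, the check in the hunk above would become a single call with `in.charOffset - 1` as the start index, which keeps the off-by-one in one visible place.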

noti0na1 · 2025-10-27T13:10:14Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

+     *
+     *  @param str The string content to dedent
+     *  @param offset The source offset where the string literal begins
+     *  @return The dedented string, or str if errors were reported


Given the importance of other parameters, it would be better to explain them as well, maybe with an example.

noti0na1 · 2025-10-27T13:22:36Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

    }

+    /** Extract the closing indentation from the last line of a string */
+    private def extractClosingIndent(str: String, offset: Offset): (String, Boolean) = {


I would use an Option[String] for the result.
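
A sketch of what an Option-returning variant could look like. It is simplified on purpose: the `offset` parameter and error reporting are omitted, and it assumes the Boolean in the current `(String, Boolean)` result signals whether a valid closing indent was found, which this thread does not confirm.

```scala
// Hypothetical simplification: the real method also takes an Offset and reports errors.
// Returns Some(indent) only when the last line holds nothing but spaces or tabs.
def extractClosingIndent(str: String): Option[String] =
  val lastLine = str.substring(str.lastIndexOf('\n') + 1)
  if str.contains('\n') && lastLine.forall(c => c == ' ' || c == '\t') then Some(lastLine)
  else None
```

Callers can then pattern match on `Some(indent)` / `None` instead of inspecting a separate flag.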

noti0na1 · 2025-10-27T13:23:47Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

+        val linesAndWithSeps = (str.linesIterator.zip(str.linesWithSeparators)).toSeq
+        var lineOffset = offset
+        // start counting error location offsets only after opening delimiter
+        while(in.buf(lineOffset) == '\'') lineOffset += 1


missing a space after while

noti0na1 · 2025-10-27T13:31:09Z

tests/run/dedented-string-literals.scala

@@ -0,0 +1,288 @@
+// Test runtime behavior of dedented string literals


Can you add the following small tests:

val x = s''' ${"content with\nnewline"} more text '''
val nested = s''' outer ${''' inner '''} '''
val nested2 = s''' outer ${s''' inner with $x '''} '''

.

9eb87a2

Gedochao requested review from odersky and sjrd October 15, 2025 08:33

Gedochao changed the title from "WIP dedented triple-quoted string literals" to "SIP-72: WIP dedented triple-quoted string literals" Oct 15, 2025

Gedochao added the needs-minor-release, needs-sip, and stat:sip-in-progress labels and removed the needs-sip label Oct 15, 2025

Gedochao marked this pull request as draft October 15, 2025 08:49

lihaoyi added 16 commits October 15, 2025 17:47

.

5109265

.

ab9a589

.

00f04b8

.

40f397f

.

48680cc

.

3a36a0f

.

b181814

.

7e8e5a7

.

c9fbf70

.

aa18b7e

.

17205d9

.

300f300

.

5c8c892

.

b687fcc

wip

3ea3e7e

.

b1613c7

He-Pin reviewed Oct 15, 2025

.

2fd9e0e

odersky reviewed Oct 15, 2025

lihaoyi commented Oct 15, 2025

lihaoyi added 4 commits October 16, 2025 01:10

.

f83defe

.

ac4c475

wip

4f2c7f4

Merge branch 'main' into dedented-strings

606e37a

lihaoyi marked this pull request as ready for review October 25, 2025 00:43

lihaoyi added 3 commits October 25, 2025 08:58

consolidate interpolator parsing

02f9cf2

.

8862764

cleanup

3aefb59

lihaoyi force-pushed the dedented-strings branch from 809656c to 3aefb59 Compare October 25, 2025 01:54

lihaoyi added 3 commits October 25, 2025 09:58

cleanup

c288c38

.

68e1742

wip

062b3ef

lihaoyi changed the title from "SIP-72: WIP dedented triple-quoted string literals" to "SIP-72: dedented triple-quoted string literals" Oct 25, 2025

lihaoyi added 3 commits October 25, 2025 13:43

Update dedented-string-literals.scala

3ea6416

.

049b6cb

.

f4a507f

Gedochao requested a review from odersky October 27, 2025 08:00

Gedochao assigned sjrd and odersky and unassigned odersky Oct 27, 2025

Gedochao requested a review from noti0na1 October 27, 2025 08:26

Gedochao assigned noti0na1 Oct 27, 2025

noti0na1 requested changes Oct 27, 2025

Gedochao requested a review from hamzaremmal October 27, 2025 13:37

Gedochao assigned hamzaremmal Oct 27, 2025
